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6.2 Iterative Methods 

New solution methods are needed when a problem Ax = b is too large and expensive 
for ordinary elimination. We are thinking of sparse matrices A, so that multiplications 
Ax are relatively cheap. If A has at most p nonzeros in every row, then Ax needs 
at most pn multiplications. Typical applications are to large finite difference or 
finite element equations, where we often write A = K. 

We are turning from elimination to look at iterative methods. There are really 
two big decisions, the preconditioner P and the choice of the method itself: 

1. A good preconditioner P is close to A but much simpler to work with. 

2. Options include pure iterations (6.2), multigrid (6.3), and Krylov 
methods (6.4), including the conjugate gradient method. 

Pure iterations compute each new Xk+i from Xk — P^^{Axk — b). This is called 
'^stationary" because every step is the same. Convergence to Xoo = A~^b, studied 
below, will be fast when all eigenvalues of M = I — P^^A are small. It is easy to 
suggest a preconditioner, but not so easy to suggest an excellent P {Incomplete LU 
is a success). The older iterations of Jacobi and Gauss-Seidel are less favored (but 
they are still important, you will see good points and bad points). 

Multigrid begins with Jacobi or Gauss-Seidel iterations, for the one job that they 
do well. They remove high frequency components (rapidly oscillating parts) to leave 
a smooth error. Then the central idea is to move to a coarser grid — where the rest 
of the error can be destroyed. Multigrid is often dramatically successful. 

Krylov spaces contain all combinations of b, Ab, A%, . . . and Krylov methods 
look for the best combination. Combined with preconditioning, the result is terrific. 
When the growing subspaces reach the whole space R", those methods give the 
exact solution A'^^b. But in reality we stop much earlier, long before n steps are 
complete. The conjugate gradient method [for positive definite A, and with a good 
preconditioner) has become truly important. 

The goal of numerical linear algebra is clear: Find a fast stable algorithm that 
uses the special properties of the matrix. We meet matrices that are symmetric 
or triangular or orthogonal or tridiagonal or Hessenberg or Givens or Householder. 
Those are at the core of matrix computations. The algorithm doesn't need details 
of the entries (which come from the specific application). By concentrating on the 
matrix structure, numerical linear algebra offers major help. 

Overall, elimination with good numbering is the first choice ! But storage and 
CPU time can become excessive, especially in three dimensions. At that point we 
turn from elimination to iterative methods, which require more expertise than K\F. 
The next pages aim to help the reader at this frontier of scientific computing. 
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Stationary Iterations 

We begin with old-style pure stationary iteration. The letter K will be reserved for 
"Krylov" so we leave behind the notation KU = F. The linear system becomes Ax = 
b. The large sparse matrix A is not necessarily symmetric or positive definite: 

Linear system Ax = b Residual = b — Axk Preconditioner P ^ A 

The preconditioner P attempts to be close to A and still allow fast iterations. The 
Jacobi choice P = diagonal of A is one extreme (fast but not very close). The other 
extreme is P = A (too close). Splitting the matrix A gives a new form of Ax = b: 

Splitting Px = {P - A)x + h. (1) 

This form suggests an iteration, in which every vector Xk leads to the next 0:^+1: 

Iteration Px^j^^ = (P — A)xk + b . (2) 

Starting from any xq, the first step finds Xi from Pxi = {P — A)xo + b. The iteration 
continues to X2 with the same matrix P, so it often helps to know its triangular factors 
in P = LU. Sometimes P itself is triangular, or L and U are approximations to the 
triangular factors of A. Two conditions on P make the iteration successful: 

1. The new Xk+i must be quickly computable. Equation (2) must be fast to solve. 

2. The errors = x — Xk should approach zero as rapidly as possible. 

Subtract equation (2) from (1) to find the error equation. It connects to et+i'- 

Error Pck+i = (P — A)ek which means efc_|_i = (/ — P~^A)ei, = Me^- (3) 

The right side b disappears in this error equation. Each step multiplies the error 
vector Cfc by M. The speed of convergence of Xk to x (and of to zero) depends 
entirely on M. The test for convergence is given by the eigenvalues of M: 



Convergence test 


Every eigenvalue of M = I — P ^A must have A(M) < 1. 







The largest eigenvalue (in absolute value) is the spectral radius p{M) = max |A(M)|. 
Convergence requires p(M) < 1. The convergence rate is set by the largest eigen- 
value. For a large problem, we are happy with p{M) = .9 and even p{M) = .99. 



When the initial error cq happens to be an eigenvector of M, the next error is 
ei = Meo = Acq. At every step the error is multiplied by A. So we must have 
|A| < 1. Normally cq is a combination of all the eigenvectors. When the iteration 
multiplies by M, each eigenvector is multiplied by its own eigenvalue. After k steps 
those multipliers are A'^, and the largest is (p(M))'^. 

If we don't use a preconditioner then M = I — A. All the eigenvalues of A must 
be inside a unit circle centered at 1, for convergence. Our second difference matrices 
A = K would fail this test {I — K is too large). The first job of a preconditioner is 
to get the matrix decently scaled. Jacobi will now give p{I — ^K) < 1, and a really 
good P will do more. 
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Jacobi Iterations 



For preconditioner we first propose a simple choice: 



Jacobi iteration 


P = diagonal part D oi A 









Typical examples have spectral radius p{M) = 1 —cN~'^, where N counts meshpoints 
in the longest direction. This comes closer and closer to 1 (too close) as the mesh is 
refined and increases. But Jacobi is important, it does part of the job. 

For our tridiagonal matrices K, Jacobi's preconditioner is just P = 21 (the diago- 
nal of K). The Jacobi iteration matrix becomes M — I — D~^A — I — \K: 



Iteration matrix 
for a Jacobi step 



M 



I--K 
2 



1 
2 



1 

1 1 

1 1 
1 



(4) 



Here is x"^* from in detail. You see how Jacobi shifts the off-diagonal entries 
of A to the right-hand side, and divides by the diagonal part D = 21: 
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(5) 



The equation is solved when x"^* 



X 



old 



, but this only happens in the limit. The real 



question is the number of iterations to get close to convergence, and that depends on 
the eigenvalues of M. How close is A max to 1 ? 

Those eigenvalues are simple cosines. In Section 1.5 we actually computed all 
the eigenvalues \{K) = 2 — 2cos(;^^^). Since M = I — jK, we now divide those 
eigenvalues by 2 and subtract from 1. The eigenvalues cos jO of M are less than 1 ! 

: - 2 cos j6' 



Jacobi eigenvalues Aj(M) = 1 



cos jO with 9 



TT 



Convergence is safe (but slow) because | cos 6*1 < 1. Small angles have cos^^ ~ 1 
The choice j = 1 gives us the first (and largest) eigenvalue of M: 



Spectral radius 



Amax(M) =cos^^^ 1 



TT 



N+1 



(6) 



(7) 



Convergence is slow for the lowest frequency. The matrix M in (4) has the 

four eigenvalues cos |, cos cos and cos ^ (which is — cos |) in Figure 6.8. 

There is an important point about those Jacobi eigenvalues Xj{M) = cos j6. The 
magnitude \Xj\ at j = N is the same as the magnitude at j = 1. This is not good for 
multigrid, where high frequencies need to be strongly damped. So weighted Jacobi 
has a valuable place, with a weighting factor u; in M = / — ujD^^A: 
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Jacobi weighted by ui 



High frequency smoothing 



1 
3 




Figure 6.8: The eigenvalues of Jacobi's M = I — are cos j9, starting near A = 1 
and ending near A = —1. Weighted Jacobi has X = 1 — u + u cos j6, ending near 
A = 1 - 2cj. Both graphs show j = 1, 2, 3, 4 and d = = ^ (with = |). 

Jacobi's iteration matrix M = I — D~^A changes to M = I — ujD~^ A. 

The preconditioner is now P = D/uj. Here are the eigenvalues A(M) when A = K: 



Weighted 
Jacobi 



D = 21 and M = I A and uo < 1 

2 



uo 



Xj{M) = 1 (2 -2 cos j^) = l-u) + u: cos jO. 



(8) 



The dashed-line graph in Figure 6.8 shows these values Aj(M) for = |. This u is 
optimal in damping the high frequencies {jO between 7r/2 and tt) by at least |: 

vr 2 1 

A(M) = l-uj + uj cos - = !-- = - 

4 



TT 



At jO = - 
2 



At j6 = TT 



A(M) = l- u; + u;cos7r = l- 



1 
3 



If we move away from u; = |, one of those eigenvalues will increase in magnitude. A 
weighted Jacobi iteration will be a good smoother within multigrid. 

In two dimensions the picture is essentially the same. The eigenvalues of 
K2T) are the sums + of the N eigenvalues of K. All eigenvectors are 
samples of sin j'ttx sin kiry. (In general, the eigenvalues of kron(A, B) are Xj{A)Xk{B). 
For kron(i^', /) + kron(/, K), sharing eigenvectors means we can add eigenvalues.) 

Jacobi has P = 41, from the diagonal of K2B. So M2D = /2D - lK2B: 



1 



1 



A^fc(M2D) = 1 - - [XjiK) + Xk{K)] = -cosje+-coske. 



(9) 



With j = k = 1, the spectral radius Amax(^) = cos 6* is the same 1 — cN~'^ as in ID 
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Numerical Experiments 

The multigrid method grew out of the slow convergence of Jacobi iterations. You 
have to see how typical error vectors begin to decrease and then stall. For weighted 

Jacobi, the high frequencies disappear long before the low frequencies. Figure 6. 

shows a drop between eo and e , and then very slow decay toward Coo = 0. We have 

chosen the second difference matrix A = K^oo and started the iterations with the 
right-hand side Xq = b = rand(500, 1). 

This unacceptably slow convergence is hidden if we only look at the residual 
Tk = b — Axk. Instead of measuring the error x — in the solution, measures 
the error in the equation. That residual error does fall quickly. The key that led to 
multigrid is the rapid drop in r with such a slow drop in e. 

Figure 6.9: The residual b — Ax falls quickly but the solution error gets stalled. 



Gauss-Seidel and the Red-Black Ordering 



The Gauss-Seidel idea is to use the components of x"^* as soon as they are computed. 
This cuts the storage requirement in half, since x°^^ is overwritten by x"^™. The 
preconditioner P = D + L becomes triangular instead of diagonal (still easy to use): 



Gauss-Seidel iteration 


P = lower triangular part of A 









Gauss-Seidel gives faster error reduction than ordinary Jacobi, because the Jacobi 
eigenvalues cosj9 become (cosj^)^. The spectral radius is squared, so one Gauss- 
Seidel step is worth two Jacobi steps. The (large) number of iterations is cut in half, 
when the — I's below the diagonal stay with the 2's on the left side of Px"^^: 



Gauss-Seidel 
p^new ^ 

(P-A)a;°'^ + b 





Xi 

X3 



+ 2 



Xi 
X2 
X3 
Xi 



X2 

X3 
Xi 





old 



bi 
b2 

bi 



. (10) 



A new Xi comes from the first equation, because P is triangular. Using x"^* in the 
second equation gives x"^*, which enters the third equation. Problem 1 shows that 
^new _ (^cos j^)^x°'*^, with the correct eigenvector x"'*^. All the Jacobi eigenvalues 
cosj9 are squared for Gauss-Seidel, so they become smaller. 

Symmetric Gauss-Seidel comes from a double sweep, reversing the order of 
components to make the combined process symmetric. By itself, / — P^^A is not 
sjTiimetric for triangular P. 

A red-black ordering produces a neat compromise between Jacobi and Gauss- 
Seidel. Imagine that a two-dimensional grid is a checkerboard. Number the red nodes 
before the black nodes. The numbering will not change Jacobi's method (which keeps 
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all of X to create x"^™). But Gauss-Seidel will be improved. In one dimension, 
Gauss-Seidel updates all the even (red) components X2j using known black values. 
Then it updates the odd (black) components X2j+i using the new red values: 

X2j < — ^ {x2j^i + X2j+i + b2j) and then X2j+i < — ^ {x2j + X2j+2 + hj+i) ■ (H) 



In two dimensions, Xij is red when i + j is even, and black when i + j is odd. 
Laplace's five-point difference matrix uses four black values to update each center 
value (red). Then red values update black, giving one example of block Gauss-Seidel. 

For line Gauss-Seidel, each row of grid values forms a block. The preconditioner 
P is block triangular. That is the right way to see P in 2D: 



-^red — black 



4/ 

-I's 





4/ 



and P|ine G-S 



K+2I 
-/ K+2I 
-/ K+2I 



A great feature is the option to compute all red values in parallel. They need only 
black values and can be updated in any order — there are no connections among red 
values (or within black values in the second half-step). For line Jacobi, the rows in 2D 
can be updated in parallel (and the plane blocks in 3D). Block matrix computations 
are efficient. 

Overrelaxation (SOR) is a combination of Jacobi and Gauss-Seidel, using a 
factor uj that almost reaches 2. The preconditioner is P = D + uoL. (By overcorrecting 
from Xk to Xk+i, hand calculators noticed that they could finish in a few weeks.) My 
earlier book and many other references show how u is chosen to minimize the spectral 
radius p{M), improving p = 1 — cN~'^ to p{M) = 1 — cN~^. Then convergence is 
much faster (A^ steps instead of iV^, to reduce the error by a constant factor like e). 



Incomplete LU 

A different approach has given much more flexibility in constructing a good P. The 
idea is to compute an incomplete LU factorization of the true matrix A: 

Incomplete LU P = {approximation to L) {approximation to U) (12) 

The exact A = LU has flll-in. So does Cholesky's A = P7R. Zero entries in A 
become nonzero in L and U and R. But P = i^approxf^approx can keep only the flll-in 
entries F above a flxed tolerance. The MATLAB commands for incomplete LU are 

[ L, f/, Perm] = luinc(yl, tol) or i? = cholinc(yl, tol) . 

If you set tol = 0, those letters inc have no effect. This becomes ordinary sparse 
LU (and Cholesky for positive deflnite A). A large value of tol will remove all flll-in. 
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Difference matrices like K2D can maintain zero row sums by adding entries below 
tol to the main diagonal (instead of destroying those entries completely). This 
modified ILU is a success (mluinc and mcholinc). The variety of options, and 
especially the fact that the computer can decide automatically how much fill-in to 
keep, has made incomplete LU a very popular starting point. 

In the end, Pxk+i = {P—A)xk+h is too simple ! Often the smooth (low frequency) 
errors decrease too slowly. Multigrid will fix this. And pure iteration is choosing one 
particular vector in a "Krylov subspace." With relatively little work we can make a 
much better choice of Xk- Multigrid methods and Krylov projections are the state of 
the art in today's iterative methods. 

Problem Set 6.2 

1 If = (cos ^ sin cos^ .^^inf^,..., cos^ ^ sin f^) show that 
the Gauss-Seidel iteration (10) is satisfied with a;"^* = [cos^ 'W+l\ ^"'"^^ This 
shows that M for Gauss-Seidel has A = cos^ -^^-^ (squares of Jacobi eigenvalues). 



